(57) ABSTRACT

An agent may be coupled to receive a clock signal associated
with the bus, and may be configured to drive a signal
responsive to a first edge (rising or falling) of the clock
signal and to sample signals responsive to the second edge. 
The sampled signals may be evaluated to allow for the 
driving of a signal on the next occurring first edge of the 
clock signal. By using the first edge to drive signals and the 
second edge to sample signals, the amount of time dedicated 
for signal propagation may be one half clock cycle. Band- 
width and/or latency may be positively influenced. In some 
embodiments, protocols which may require multiple clock 
cycles on other buses may be completed in fewer clock 
cycles. For example, certain protocols which may require 
two clock cycles may be completed in one clock cycle. In 
one specific implementation, for example, arbitration may 
be completed in one clock cycle. Request signals may be 
driven responsive to the first edge of the clock signal and 
sampled responsive to the second edge. The sampled signals 
may be evaluated to determine an arbitration winner, which 
may drive the bus responsive to the next occurrence of the 
first edge. 
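
The single-cycle arbitration described above can be sketched as a small simulation. This is only an illustrative model, not the patented circuit: the function name `arbitrate_one_cycle`, the bit-per-agent encoding of the request lines, and the fixed-priority policy are all assumptions made for the example (the text does not specify an arbitration policy).

```python
def arbitrate_one_cycle(requesting_agents, priority_order):
    """Return the winning agent id for one clock cycle, or None.

    Hypothetical model: one request line per agent, fixed priority.
    """
    # First edge (e.g. rising): each requester drives its own request line.
    request_lines = 0
    for agent in requesting_agents:
        request_lines |= 1 << agent

    # Second edge (e.g. falling): every agent samples the request lines
    # after they have had half a clock cycle to propagate.
    sampled = request_lines

    # Evaluate phase (remainder of the cycle): choose a winner.
    # Fixed priority is an assumption; any suitable scheme could be used.
    for agent in priority_order:
        if sampled & (1 << agent):
            return agent
    return None


if __name__ == "__main__":
    # Agents 1 and 3 request in the same cycle; agent 3 has higher priority.
    winner = arbitrate_one_cycle([1, 3], priority_order=[3, 2, 1, 0])
    print("winner:", winner, "(may drive the bus on the next first edge)")
```

Because the request, sample, and evaluate steps all fit between one first edge and the next, the winner can drive the bus in the very next clock cycle.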

35 Claims, 6 Drawing Sheets 




03/21/2004, EAST Version: 1.4.1 



US 6,678,767 B1



OTHER PUBLICATIONS 

Jim Keller, "The Mercurian Processor: A High Performance, 
Power-Efficient CMP for Networking," Oct. 10, 2000, 22 
pages. 

Tom R. Halfhill, "SiByte Reveals 64-Bit Core For NPUs;
Independent MIPS64 Design Combines Low Power, High
Performance," Microdesign Resources, Jun. 2000,
Microprocessor Report, 4 pages.

Halfhill, "SiByte Reveals 64-Bit Core for NPUs,"
Microprocessor Report, Jun. 2000, pp. 45-48.

Pentium® Pro Family Developer's Manual vol. 1:
Specifications, Chapter 3, pp. 1-25, 1996.
* cited by examiner 






U.S. Patent    Jan. 13, 2004    Sheet 1 of 6    US 6,678,767 B1



[Fig. 1: block diagram of System 10. Processors 12A and 12B, L2 Cache 14,
Memory Controller 16, and I/O Bridges 20A and 20B are coupled to Bus 24,
which comprises Arbitration 28, Address 30, Response 32, Data 34, and
CLK 36. I/O Interfaces 22A-22D are coupled to the I/O bridges, and
Memory 26 is coupled to the memory controller.]






[Fig. 4: signals on arbitration lines 28: A_Req[7:0], D_Req[7:0],
Block[7:0].]

[Fig. 5: signals on address bus 30: Addr[39:5], A_ID[9:0], A_CMD[2:0],
A_BYEN[31:0], A_ATTR[n:0].]






[Fig. 6: response signals: R_SHD[5:0], R_EXC[5:0].]

[Fig. 7: data bus signals: Data[255:0], D_ID[9:0], D_RSP[3:0],
D_Code[2:0], D_Mod.]






[Fig. 8: differential pairs of signals for Addr[39] and Addr[38];
legible reference numerals: 110B, 112A, 112B.]

[Fig. 9: Carrier Medium 300 containing System 10.]






US 6,678,767 B1

BUS SAMPLING ON ONE EDGE OF A
CLOCK SIGNAL AND DRIVING ON
ANOTHER EDGE

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention is related to digital systems and, more
particularly, to buses within digital systems.

2. Description of the Related Art

A bus is frequently used in digital systems to interconnect
a variety of devices included in the digital system. Generally,
one or more devices are connected to the bus, and use the
bus to communicate with other devices connected to the bus.
As used herein, the term "agent" refers to a device which is
capable of communicating on the bus. The agent may be a
requesting agent if the agent is capable of initiating
transactions on the bus, and may be a responding agent if the
agent is capable of responding to a transaction initiated by a
requesting agent. A given agent may be capable of being
both a requesting agent and a responding agent.
Additionally, a "transaction" is a communication on the bus.
The transaction may include an address transfer and
optionally a data transfer. Transactions may be read transactions
(transfers of data from the responding agent to the
requesting agent) and write transactions (transfers of data from the
requesting agent to the responding agent). Transactions may
further include various coherency commands which may or
may not involve a transfer of data.

The bus is a shared resource among the agents, and thus
may affect the performance of the agents to the extent that
the bus may limit the amount of communication by each
agent and the latency of that communication. Generally, a
bus may be characterized by latency and bandwidth. The
latency may be affected by the amount of time used to
arbitrate for the bus and to perform a transaction on the bus.
The bandwidth may be affected by the amount of
information (e.g. bits or bytes) that may be transmitted per cycle, as
well as the amount of time used to perform the transfer. Both
latency and bandwidth may be affected by the physical
constraints of the bus and the protocol employed by the bus.

For example, many bus protocols require two clock cycles
for arbitration: the transmission of the requests for the bus
during the first clock cycle and the determination of the grant
(and transmittal of the grant, in a central arbitration scheme)
during the second clock cycle. The transaction may be
initiated by the agent receiving the grant during the third
clock cycle. The clock cycles may each be a period of a
clock signal associated with the bus. Similarly, most bus
protocols are limited in the number of bytes of data which
may be transferred per clock cycle (e.g. 8 bytes is typical).
Accordingly, transferring a cache block of data (which tends
to dominate the transfers performed in modern digital
systems) requires multiple clock cycles (e.g. 4 clock cycles
for a 32 byte cache block on an 8 byte bus).

SUMMARY OF THE INVENTION

The problems outlined above are in large part solved by
a system including one or more agents coupled to a bus. The
agent may be coupled to receive a clock signal associated
with the bus, and may be configured to drive a signal
responsive to a first edge (rising or falling) of the clock
signal and to sample signals responsive to the second edge.
The sampled signals may be evaluated to allow for the
driving of a signal on the next occurring first edge of the
clock signal.

By using the first edge to drive signals and the second
edge to sample signals, the amount of time dedicated for
signal propagation may be one half clock cycle. Bandwidth
and/or latency may be positively influenced. In some
embodiments, protocols which may require multiple clock
cycles on other buses may be completed in fewer clock
cycles. For example, certain protocols which may require
two clock cycles may be completed in one clock cycle. In
one specific implementation, for example, arbitration may
be completed in one clock cycle. Request signals may be
driven responsive to the first edge of the clock signal and
sampled responsive to the second edge. The sampled signals
may be evaluated to determine an arbitration winner, which
may drive the bus responsive to the next occurrence of the
first edge.

In one specific implementation, the data bus may be sized
to allow for a single cycle data transfer for even the largest
sized data that may be transferred in one transaction. For
example, the data bus may be sized to transfer a cache block
per clock cycle. In one implementation, the bus and agents
may be integrated onto a single integrated circuit. Since the
bus is internal to the integrated circuit, it may not be limited
by the number of pins which may be available on the
integrated circuit. Such an implementation may be
particularly suited to a data bus sized to allow single cycle data
transfers. Additionally, differential pairs may be used for each
signal or a subset of the bus signals. Differential signalling may
enhance the frequency at which the bus may operate.

In one particular implementation, the bus may support
coherency and out of order data transfers (with respect to the
order of the address transfers). The bus may support tagging
of address and data phases, for example, to match address
and corresponding data phases.

Broadly speaking, a system is contemplated comprising a
bus and an agent coupled to receive a clock
signal for the bus. The clock signal has a rising edge and a
falling edge during use. The agent is configured to drive one
or more signals on the bus responsive to a first edge of the
rising edge or the falling edge, and is further configured to
sample a value on the bus responsive to a second edge of the
rising edge or the falling edge.

Additionally, a method is contemplated. A value is driven
on a bus responsive to a first edge of a rising edge or a falling
edge of a clock signal for the bus. A value is sampled from
the bus responsive to a second edge of the rising edge or the
falling edge.

BRIEF DESCRIPTION OF THE DRAWINGS

Other objects and advantages of the invention will
become apparent upon reading the following detailed
description and upon reference to the accompanying
drawings in which:

FIG. 1 is a block diagram of one embodiment of a system.

FIG. 2 is a timing diagram illustrating transmission of
signals on one embodiment of a bus within the system
shown in FIG. 1.

FIG. 3 is a timing diagram illustrating several exemplary
bus transactions.

FIG. 4 is a block diagram illustrating exemplary signals
which may be included in one embodiment of an arbitration
portion of a bus.

FIG. 5 is a block diagram illustrating exemplary signals
which may be included in one embodiment of an address
bus.

FIG. 6 is a block diagram illustrating exemplary signals
which may be included in one embodiment of a response
portion of a bus.

FIG. 7 is a block diagram illustrating exemplary signals
which may be included in one embodiment of a data bus. 

FIG. 8 is a block diagram illustrating differential pairs of 
signals which may be used in one embodiment of a bus. 

FIG. 9 is a block diagram of a carrier medium.

While the invention is susceptible to various modifica- 
tions and alternative forms, specific embodiments thereof 
are shown by way of example in the drawings and will 
herein be described in detail. It should be understood, 
however, that the drawings and detailed description thereto
are not intended to limit the invention to the particular form 
disclosed, but on the contrary, the intention is to cover all 
modifications, equivalents and alternatives falling within the 
spirit and scope of the present invention as defined by the 
appended claims. 

DETAILED DESCRIPTION OF THE 
PREFERRED EMBODIMENTS 

Turning now to FIG. 1, a block diagram of one embodiment
of a system 10 is shown. Other embodiments are
possible and contemplated. In the embodiment of FIG. 1,
system 10 includes processors 12A-12B, an L2 cache 14, a
memory controller 16, a pair of input/output (I/O) bridges
20A-20B, and I/O interfaces 22A-22D. System 10 may
include a bus 24 for interconnecting the various components
of system 10. More particularly, as illustrated in FIG. 1, bus
24 may include arbitration lines 28, an address bus 30,
response lines 32, a data bus 34, and a clock line or lines 36.
As illustrated in FIG. 1, each of processors 12A-12B, L2
cache 14, memory controller 16, and I/O bridges 20A-20B
are coupled to bus 24. Thus, each of processors 12A-12B,
L2 cache 14, memory controller 16, and I/O bridges
20A-20B may be an agent on bus 24 for the illustrated
embodiment. More particularly, each agent may be coupled
to clock line(s) 36 and to the conductors within bus 24 that
carry signals which that agent may sample and/or drive. I/O
bridge 20A is coupled to I/O interfaces 22A-22B, and I/O
bridge 20B is coupled to I/O interfaces 22C-22D. L2 cache
14 is coupled to memory controller 16, which is further
coupled to a memory 26.

Bus 24 may be a split transaction bus in the illustrated
embodiment. A split transaction bus splits the address and
data portions of each transaction and allows the address
portion (referred to as the address phase) and the data
portion (referred to as the data phase) to proceed
independently. In the illustrated embodiment, the address bus 30 and
data bus 34 are independently arbitrated for (using signals
on arbitration lines 28). Each transaction including both
address and data thus includes an arbitration for the address
bus 30, an address phase on the address bus 30, an
arbitration for the data bus 34, and a data phase on the data bus 34.
Additionally, coherent transactions may include a response
phase on response lines 32 for communicating coherency
information after the address phase. Additional details
regarding one embodiment of bus 24 are provided further
below. The bus clock signal CLK on clock line(s) 36 defines
the clock cycle for bus 24.
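
As a rough illustration of the split transaction structure just described, the sketch below lists the steps a transaction goes through. The function name `transaction_steps` and its two flags are hypothetical and chosen only for this example. The relative order of the response phase and the data steps in the returned list is illustrative, since the address and data portions proceed independently.

```python
def transaction_steps(has_data, coherent):
    """List the bus steps of one transaction (illustrative sketch only)."""
    # Every transaction arbitrates for the address bus, then uses it.
    steps = ["address arbitration", "address phase"]
    # Coherent transactions add a response phase after the address phase.
    if coherent:
        steps.append("response phase")
    # Transactions that move data arbitrate separately for the data bus.
    if has_data:
        steps += ["data arbitration", "data phase"]
    return steps


if __name__ == "__main__":
    for step in transaction_steps(has_data=True, coherent=True):
        print(step)
```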

Bus 24 may be pipelined. Bus 24 may employ any 
suitable signalling technique. For example, in one
embodiment, differential signalling may be used for high 
speed signal transmission. Other embodiments may employ 
any other signalling technique (e.g. TTL, CMOS, GTL, 
HSTL, etc.). 

Processors 12A-12B may be designed to any instruction
set architecture, and may execute programs written to that
instruction set architecture. Exemplary instruction set
architectures may include the MIPS instruction set architecture
(including the MIPS-3D and MIPS MDMX application
specific extensions), the IA-32 or IA-64 instruction set
architectures developed by Intel Corp., the PowerPC
instruction set architecture, the Alpha instruction set architecture,
the ARM instruction set architecture, or any other
instruction set architecture.

L2 cache 14 is a high speed cache memory. L2 cache 14 
is referred to as "L2" since processors 12A-12B may
employ internal level 1 ("L1") caches. If L1 caches are not
included in processors 12A-12B, L2 cache 14 may be an L1
cache. Furthermore, if multiple levels of caching are 
included in processors 12A-12B, L2 cache 14 may be an 
outer level cache than L2. L2 cache 14 may employ any 
organization, including direct mapped, set associative, and 
fully associative organizations. In one particular 
implementation, L2 cache 14 may be a 512 kilobyte, 4 way 
set associative cache having 32 byte cache lines. A set 
associative cache is a cache arranged into multiple sets, each 
set comprising two or more entries. A portion of the address 
(the "index") is used to select one of the sets (i.e. each 
encoding of the index selects a different set). The entries in 
the selected set are eligible to store the cache line accessed 
by the address. Each of the entries within the set is referred 
to as a "way" of the set. The portion of the address remaining 
after removing the index (and the offset within the cache 
line) is referred to as the "tag", and is stored in each entry 
to identify the cache line in that entry. The stored tags are 
compared to the corresponding tag portion of the address of 
a memory transaction to determine if the memory transac- 
tion hits or misses in the cache, and is used to select the way 
in which the hit is detected (if a hit is detected). 
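
The index, tag, and way selection described above can be illustrated for the 512 kilobyte, 4 way, 32 byte line configuration mentioned in the text. This is a minimal sketch under stated assumptions: the names `split_address` and `lookup`, and the representation of the tag store as a list of per-set tag lists, are hypothetical and are not taken from the patent.

```python
CACHE_BYTES = 512 * 1024   # 512 kilobyte cache (from the text)
WAYS = 4                   # 4 way set associative (from the text)
LINE_BYTES = 32            # 32 byte cache lines (from the text)

NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)   # 4096 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1       # 5 offset bits
INDEX_BITS = NUM_SETS.bit_length() - 1          # 12 index bits


def split_address(addr):
    """Split an address into (tag, index, offset within the cache line)."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def lookup(stored_tags, addr):
    """Return the way that hits, or None on a miss.

    stored_tags is an assumed structure: one list of WAYS tags per set,
    with None marking an empty entry.
    """
    tag, index, _ = split_address(addr)
    for way, stored in enumerate(stored_tags[index]):
        if stored == tag:
            return way
    return None


if __name__ == "__main__":
    tags = [[None] * WAYS for _ in range(NUM_SETS)]
    tag, index, offset = split_address(0x12345678)
    tags[index][2] = tag                 # pretend the line lives in way 2
    print(lookup(tags, 0x12345678))      # hit: same tag and index
    print(lookup(tags, 0x12345678 + (1 << 17)))  # different tag: miss
```

With these parameters the low 5 address bits select the byte within the line, the next 12 bits select the set, and the remaining upper bits form the tag.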

Memory controller 16 is configured to access memory 26 
in response to memory transactions received on bus 24. 
Memory controller 16 receives a hit signal from L2 cache 
14, and if a hit is detected in L2 cache 14 for a memory 
transaction, memory controller 16 does not respond to that 
memory transaction. If a miss is detected by L2 cache 14, or 
the memory transaction is non-cacheable, memory controller
16 may access memory 26 to perform the read or write
operation. Memory controller 16 may be designed to access
any of a variety of types of memory. For example, memory 
controller 16 may be designed for synchronous dynamic 
random access memory (SDRAM), and more particularly 
double data rate (DDR) SDRAM. Alternatively, memory 
controller 16 may be designed for DRAM, Rambus DRAM 
(RDRAM), SRAM, or any other suitable memory device. 
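
The condition under which the memory controller services a transaction can be written as a one-line predicate. This is only a sketch of the rule stated above; the function name `memory_controller_accesses_memory` and its boolean parameters are hypothetical.

```python
def memory_controller_accesses_memory(l2_hit, cacheable):
    """Return True if the memory controller may access memory.

    Sketch of the stated rule: memory is accessed when the L2 cache
    reports a miss or when the transaction is non-cacheable.
    """
    return (not l2_hit) or (not cacheable)


if __name__ == "__main__":
    print(memory_controller_accesses_memory(l2_hit=True, cacheable=True))
    print(memory_controller_accesses_memory(l2_hit=False, cacheable=True))
```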

I/O bridges 20A-20B link one or more I/O interfaces (e.g. 
I/O interfaces 22A-22B for I/O bridge 20A and I/O inter- 
faces 22C-22D for I/O bridge 20B) to bus 24. I/O bridges 
20A-20B may serve to reduce the electrical loading on bus 
24 if more than one I/O interface 22A-22B is bridged by that 
I/O bridge. Generally, I/O bridge 20A performs transactions 
on bus 24 on behalf of I/O interfaces 22A-22B and relays 
transactions targeted at an I/O interface 22A-22B from bus 
24 to that I/O interface 22A-22B. Similarly, I/O bridge 20B 
generally performs transactions on bus 24 on behalf of I/O 
interfaces 22C-22D and relays transactions targeted at an
I/O interface 22C-22D from bus 24 to that I/O interface
22C-22D. In one implementation, I/O bridge 20A may be a
bridge to a PCI interface (e.g. I/O interface 22A) and to a
Lightning Data Transport (LDT) I/O fabric developed by 
Advanced Micro Devices, Inc (e.g. I/O interface 22B). Other 
I/O interfaces may be bridged by I/O bridge 20B. Other 
implementations may bridge any combination of I/O inter- 
faces using any combination of I/O bridges. I/O interfaces 
22A-22D may include one or more serial interfaces, Personal
Computer Memory Card International Association
(PCMCIA) interfaces, Ethernet interfaces (e.g. media access
control level interfaces), Peripheral Component Interconnect
(PCI) interfaces, LDT interfaces, etc.

It is noted that system 10 (and more particularly processors
12A-12B, L2 cache 14, memory controller 16, I/O
interfaces 22A-22D, I/O bridges 20A-20B and bus 24) may
be integrated onto a single integrated circuit as a system on
a chip configuration. In another configuration, memory 26
may be integrated as well. Alternatively, one or more of the
components may be implemented as separate integrated
circuits, or all components may be separate integrated
circuits, as desired. Any level of integration may be used.

It is noted that, while the illustrated embodiment employs
a split transaction bus with separate arbitration for the
address and data buses, other embodiments may employ
non-split transaction buses arbitrated with a single arbitration
for address and data and/or a split transaction bus in
which the data bus is not explicitly arbitrated. Either a
central arbitration scheme or a distributed arbitration scheme
may be used, according to design choice.

It is noted that, while FIG. 1 illustrates I/O interfaces
22A-22D coupled through I/O bridges 20A-20B to bus 24,
other embodiments may include one or more I/O interfaces
directly coupled to bus 24, if desired.

Turning next to FIG. 2, a timing diagram is shown
illustrating transmission and sampling of signals according
to one embodiment of system 10 and bus 24. Other embodiments
are possible and contemplated. The clock signal on
clock line(s) 36 is illustrated (CLK) in FIG. 2. The high and
low portions of the clock signal CLK are delimited with
vertical dashed lines.

Generally, the clock signal CLK may have a rising edge
(the transition from a low value to a high value) and a falling
edge (the transition from a high value to a low value). The
signals on bus 24 may be driven responsive to one of the
edges and sampled responsive to the other edge. For
example, in the illustrated embodiment, signals may be
driven responsive to the rising edge and sampled responsive
to the falling edge. Thus, signals propagate on bus 24 during
the time between the rising edge and the falling edge of the
clock signal, and sampled signals may be evaluated between
the falling edge and the rising edge of the clock signal. One
or more signals on the bus may be driven with a value, and
that value may be sampled by an agent receiving the signals.

More particularly, as illustrated by arrow 40, an agent
which has determined that it will drive a signal or signals
during a clock cycle may activate its driver for each such
signal responsive to the rising edge of the clock signal. For
example, an agent may logically AND the clock signal CLK
with an internally generated signal indicating that a signal is
to be driven to produce an enable signal for a driver on the
signal (if the enable signal is asserted high). Other embodiments
may employ other logic circuits to produce the enable,
depending on whether the enable is asserted high or low and
whether the internally generated signal is asserted high or
low. Furthermore, the clock signal CLK may be logically
ORed with a delayed version of the clock signal CLK to add
hold time to avoid race conditions with the sampling of the
signal at the falling edge of the clock signal CLK, as desired.

As illustrated by arrow 42, agents may sample signals
responsive to the falling edge of the clock signal. For
example, agents may employ a senseamp (e.g. for differential
signalling), flip flop, register, latch, or other clocked
device which receives the clock signal CLK and captures the
signal on the line responsive to the falling edge of the clock
signal CLK.

In one embodiment, bus 24 may employ differential pairs
of lines for each signal. Each line may be precharged, and
then one of the lines may be driven to indicate the bit of
information transmitted on that line. For such embodiments,
the signals may be precharged between the falling edge of
the clock signal CLK and the next rising edge of the clock
signal CLK (illustrated by arrow 44). Thus, the agent driving
the signal may disable its drivers responsive to the falling
edge of the clock signal CLK. In one specific
implementation, the agent driving the signal may disable its
drivers after a predetermined delay to avoid a race condition
with the sampling of the signal. One of the agents may be
defined to perform the precharge, or a separate circuit (not
shown) may perform the precharge. Alternatively, the agent
driving the signal may perform the precharge.

Since signals are driven responsive to one edge of the
clock signal and are sampled responsive to the other edge, the
latency for performing a transaction may be reduced.
Generally, the clock cycle may be divided into a drive phase
and an evaluate phase. During the drive phase, signals are
driven. Those driven signals are sampled at the end of the
drive phase and, during the evaluate phase, those driven
signals are evaluated to determine if the sampling agent is to
perform an action with respect to the information transmitted.

For example, arbitration may be completed in one clock
cycle, according to one embodiment. The request signals for
each agent requesting the bus may be driven responsive to
the rising edge, and sampled on the falling edge. During the
remaining portion of the clock cycle, the request signals may
be evaluated to determine a winner of the arbitration. The
winning agent may drive the bus on the next rising edge. As
illustrated in FIG. 2, address arbitration request signals may
be driven (reference numeral 46) and evaluated (reference
numeral 48) in the first illustrated clock cycle. The winning
agent may drive an address portion of a transaction during
the subsequent clock cycle (reference numeral 50). Other
arbitrating agents may determine that they did not win, and
thus may drive request signals again during the subsequent
clock cycle (reference numeral 52).

Agents involved in coherency may sample the address
driven by the winning agent. During
the evaluate phase, the agents may determine if the
transaction is a coherent transaction, and thus that the agents are
to snoop the address. Additionally, the evaluate phase and
the subsequent clock cycle may be used to determine the
snoop result, which may be driven in the response phase
(reference numeral 56) and evaluated by the agent driving
the address (reference numeral 58).

Data bus arbitration may be similar, as illustrated by
reference numerals 60-70. More particularly, data arbitration
request signals may be driven (reference numeral 60)
and evaluated (reference numeral 62) in the first illustrated
clock cycle. The winning agent may drive a data portion of
a transaction during the subsequent clock cycle (reference
numeral 64). Agents which receive data may sample the
data, and may evaluate the data (reference numeral 70). For
example, in embodiments which provide tagging to allow
for out of order data transfers, the tags may be compared to
tags that the agent is awaiting data for to determine if the
agent should capture the data. Other arbitrating agents may
determine that they did not win, and thus may drive request
signals again during the subsequent clock cycle (reference
numeral 68).

As used herein, the term "drive", when referring to a
signal, refers to activating circuitry which changes the
voltage on the line carrying the signal, to thereby transmit a
bit of information. The term "sample", when referring to a
signal, refers to sensing the voltage on the line carrying the
signal to determine the bit of information conveyed on the
signal. The term "precharge" refers to setting the voltage on
a line to a predetermined value prior to the time that the line
may be driven. The predetermined value may be a supply
(high) voltage or a ground (low) voltage, for example.

While the above discussion illustrated an example in
which signals are driven responsive to the rising edge of the
clock signal CLK and sampled responsive to the falling
edge, an alternative embodiment is contemplated in which
signals may be driven responsive to the falling edge and
sampled responsive to the rising edge.

Turning next to FIG. 3, a timing diagram is shown
illustrating several exemplary transactions which may be
performed on one embodiment of bus 24. Other embodiments
are possible and contemplated. In FIG. 3, clock cycles
are delimited by vertical dashed lines and labeled (CLK 0,
CLK 1, etc.) at the top.

FIG. 3 illustrates pipelining on the bus according to one
embodiment of the bus. During clock cycle CLK 0, the
address phase of a first transaction (T1) is occurring on the
address bus (reference numeral 80). The response phase for
the first transaction occurs in clock cycle CLK 2 (reference
numeral 82). In parallel with the address phase of the first
transaction, during clock cycle CLK 0, arbitration for the
address bus is occurring and an agent wins the arbitration to
perform a second transaction (T2) (reference numeral 84).
The corresponding address phase occurs in clock cycle CLK
1 (reference numeral 86) and the response phase occurs in
clock cycle CLK 3 (reference numeral 88). In parallel with
the address phase of the second transaction during clock
cycle CLK 1, arbitration for the address bus is occurring and
an agent wins the arbitration to perform a third transaction
(T3) (reference numeral 90). The corresponding address
phase occurs in clock cycle CLK 2 (reference numeral 92)
and the response phase occurs in clock cycle CLK 4
(reference numeral 94).

Data phases for the transactions are illustrated in clock
cycles CLK N, CLK N+1, and CLK N+2. More particularly,
the data phase for the second transaction is occurring during
clock cycle CLK N (reference numeral 96). In parallel
during clock cycle CLK N, an arbitration for the data bus is
occurring and an agent wins to perform the data phase of the
first transaction (reference numeral 98). The corresponding
data phase occurs in clock cycle CLK N+1 (reference
numeral 100). In parallel during clock cycle CLK N+1, an
arbitration for the data bus is occurring and an agent wins to
perform the data phase of the third transaction (reference
numeral 102). The corresponding data phase occurs in clock
cycle CLK N+2 (reference numeral 104).

Thus, the address arbitration, address phase, response
phase, data arbitration, and data phase of various transactions
may be pipelined. Accordingly, a new transaction may
be initiated each clock cycle, providing high bandwidth.
Furthermore, in one embodiment, the data bus width is as
wide as the largest data transfer which may occur in
response to a single transaction (e.g. a cache block wide, in
one embodiment). Therefore, data transfers may occur in a
single clock cycle in such an embodiment, again allowing
for high bandwidth of one new transaction each clock cycle.
Other embodiments may employ a narrower data bus, and
may allow address transfers to last more than one clock
cycle.

It is noted that, while the data phases of the transactions
in FIG. 3 are illustrated at a later time than the address
phases, the data phases may overlap with the address phases.
In one embodiment, the data phase of a given transaction
may begin at any time after the address phase.

FIG. 3 also illustrates the out of order features of one
embodiment of bus 24. While the address phases of the three
transactions occur in a first order (T1, then T2, then T3), the
data phases occur in a different order (T2, then T1, then T3
in this example). By allowing out of order data phases with
respect to the order of the corresponding address phases,
bandwidth utilization may be high. Each responding agent
may arbitrate for the data bus once it has determined that the
data is ready to be transferred. Accordingly, other agents
(e.g. lower latency agents) may transfer data for later
transactions out of order, utilizing bandwidth while the higher
latency, but earlier initiated, transaction experiences its
latency. Generally, any two transactions may have their data
phases performed out of order with their address phases,
regardless of whether the two transactions are initiated by
the same requesting agent or different requesting agents.

In one embodiment, bus 24 may include tagging for
identifying corresponding address phases and data phases.
The address phase includes a tag assigned by the requesting
agent, and the responding agent may transmit the same tag
in the data phase. Thus, the address and data phases may be
linked. In one embodiment, the tag assigned to a given
transaction may be freed upon transmission of the data, so
that the tag may be rapidly reused for a subsequent
transaction. Queues in the agents receiving data from bus 24 may
be designed to capture data using a given tag once per queue
entry, so that a reused tag does not overwrite valid
data from a previous transaction.

FIG. 3 further illustrates the coherency features of one
embodiment of bus 24. Coherency may be maintained using
signals transmitted during the response phase of each
transaction. The response phase may be fixed in time with respect
to the corresponding address phase, and may be the point at
which ownership of the data affected by the transaction is
transferred. Accordingly, even though the data phases may
be performed out of order (even if the transactions are to the
same address), the coherency may be established based on
the order of the address phases. In the illustrated
embodiment, the response phase is two clock cycles of the
CLK clock after the corresponding address phase. However,
other embodiments may make the fixed interval longer or
shorter.

Turning next to FIG. 4, a block diagram is shown
illustrating exemplary signals which may be included on one
embodiment of arbitration lines 28. Other embodiments are
possible and contemplated. In the embodiment of FIG. 4, a
set of address request signals (A_Req[7:0]) and a set of data
request signals (D_Req[7:0]) are included. Additionally, a
set of block signals (Block[7:0]) may be included.

The address request signals may be used by each requesting
agent to arbitrate for the address bus. Each requesting
agent may be assigned one of the address request signals,
and that requesting agent may assert its address request
signal to arbitrate for the address bus. In the illustrated
embodiment, bus 24 may include a distributed arbitration
scheme in which each requesting agent may include or be
coupled to an arbiter circuit. The arbiter circuit may receive
the address request signals, determine if the requesting agent
wins the arbitration based on any suitable arbitration
scheme, and indicate a grant or lack thereof to the requesting
agent. In one embodiment, each arbiter circuit may track the
relative priority of other agents to the requesting agent, and
may update the priority based on the winning agent (as
indicated by an agent identifier portion of the tag transmitted
during the address phase).

The data request signals may be used by each responding
agent to arbitrate for the data bus. Each responding agent
may be assigned one of the data request signals, and that
responding agent may assert its data request signal to
arbitrate for the data bus. In the illustrated embodiment, bus
24 may include a distributed arbitration scheme in which
each responding agent may include or be coupled to an
arbiter circuit. The arbiter circuit may receive the data
request signals, determine if the responding agent wins the
arbitration based on any suitable arbitration scheme, and
indicate a grant or lack thereof to the responding agent. In
one embodiment, each arbiter circuit may track the relative
priority of other agents to the responding agent, and may
update the priority based on the winning agent (as indicated
by an agent identifier transmitted during the data phase).

The block signals may be used by agents to indicate a lack
of ability to participate in any new transactions (e.g. due to
queue fullness within that agent). If an agent cannot accept
new transactions, it may assert its block signal. Requesting
agents may receive the block signals and may inhibit
initiating a transaction in which that agent participates
responsive to the block signal. A transaction in which that
agent does not participate may be initiated.

Other embodiments may employ a centralized arbitration
scheme. Such an embodiment may include address grant
signals for each requesting agent and data grant signals for
each responding agent, to be asserted by the central arbiter
to the winning agent to indicate grant of the bus to that
requesting or responding agent.

Turning next to FIG. 5, a block diagram illustrating
exemplary signals which may be included on address bus 30
is shown. Other embodiments are possible and contemplated.
In the illustrated embodiment, address bus 30
includes address lines used to provide the address of the
transaction (Addr[39:5]) and a set of byte enables
(A_BYEN[31:0]) indicating which bytes on the data bus 34
are being read or written during the transaction, a command
(A_CMD[2:0]) used to indicate the transaction to be performed
(read, write, etc.), a transaction ID (A_ID[9:0]) used
to identify the transaction, and a set of attributes
(A_ATTR[n:0]).

The transaction ID may be used to link the address and
data phases of the transaction. More particularly, the
responding agent may use the value provided on the transaction
ID as the transaction ID for the data phase.
Accordingly, the transaction ID may be a tag for the transaction.
A portion of the transaction ID is an agent identifier
identifying the requesting agent. For example, the agent
identifier may be bits 9:6 of the transaction ID. Each agent
is assigned a different agent identifier.

The set of attributes may include any set of additional
attributes that it may be desirable to transmit in the address
phase. For example, the attributes may include a cacheability
indicator indicating whether or not the transaction is
cacheable within the requesting agent, a coherency indicator
indicating whether or not the transaction is to be performed
coherently, and a cacheability indicator for L2 cache 14.
Other embodiments may employ more, fewer, or other
attributes, as desired.

Turning next to FIG. 6, a block diagram illustrating
exemplary signals which may be employed on one embodiment
of response lines 32 is shown. Other embodiments are
possible and contemplated. In the embodiment of FIG. 6,
response lines 32 include a set of shared signals
(R_SHD[5:0]) and a set of exclusive signals (R_EXC[5:0]). Each
agent which participates in coherency may be assigned a
corresponding one of the set of shared signals and a corresponding
one of the set of exclusive signals. The agent may
report shared ownership of the data affected by a transaction
by asserting its shared signal. The agent may report exclusive
ownership of the data affected by a transaction by
asserting its exclusive signal. The agent may report no
ownership of the data by not asserting either signal. In the
illustrated embodiment, modified ownership is treated as
exclusive. Other embodiments may employ a modified
signal (or an encoding of signals) to indicate modified.

Turning next to FIG. 7, a block diagram illustrating
exemplary signals which may be employed on one embodiment
of data bus 34 is shown. Other embodiments are
possible and contemplated. In the embodiment of FIG. 7,
data bus 34 includes data lines (Data[255:0]) used to transfer
the data, a transaction ID (D_ID[9:0]) similar to the transaction
ID of the address phase and used to match the address
phase with the corresponding data phase, a responder ID
(D_RSP[3:0]), a data code (D_Code[2:0]), and a modified
signal (D_Mod).

The responder ID is the agent identifier of the responding
agent who arbitrated for the data bus to perform the data
transfer, and may be used by the data bus arbiter circuits to
update arbitration priority state (i.e. the responder ID may be
an indication of the data bus arbitration winner). The data
code may be used to report various errors with the transaction
(e.g. single or double bit error checking and correction
(ECC) errors, for embodiments employing ECC, unrecognized
addresses, etc.). The modified signal (D_Mod) may
be used to indicate, if an agent reported exclusive status,
whether or not the data was modified. In one embodiment,
an agent which reports exclusive status supplies the data,
and the modified indication along with the data.

It is noted that, while various bit ranges for signals are
illustrated in FIGS. 4-7, the bit ranges may be varied in
other embodiments. The number of request signals, the size
of the agent identifier and transaction ID, the size of the
address bus, the size of the data bus, etc., may all be varied
according to design choice.

Turning next to FIG. 8, a block diagram is shown illustrating
differential pairs of signals which may be used
according to one embodiment of bus 24. Other embodiments
are possible and contemplated. Two bits of the address lines
(Addr[39] and Addr[38]) are shown in FIG. 8. Each signal
on bus 24 may be differential, in one embodiment. Other
embodiments may use differential pairs for any subset of the
signals on bus 24, or no signals may be differential pairs.

In the illustrated example, differential pair of lines 110A
and 110B are used to transmit Addr[39] and differential pair
of lines 112A and 112B are used to transmit Addr[38]. Lines
110A-110B will be discussed, and lines 112A-112B may be
used similarly (as well as other differential pairs corresponding
to other signals).

Lines 110A-110B may be precharged during the precharge
time illustrated in FIG. 2. For example, lines
110A-110B may be precharged to a high voltage. One of
lines 110A-110B may be driven low based on the value of
Addr[39] desired by the driving agent. If Addr[39] is to
transmit a logical one, line 110A may be driven low. If
Addr[39] is to transmit a logical zero, line 110B may be
driven low. Receiving agents may detect the difference
between lines 110A-110B to determine the value driven on
Addr[39] for the transaction. Alternatively, lines
110A-110B may be precharged to a low voltage and one of
the lines 110A-110B may be driven high based on the value
of Addr[39] desired by the driving agent.
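
As an illustration of the high-precharge example for lines 110A-110B, the following minimal sketch models one differential pair in Python. It is not part of the described hardware: the function names, the use of 1 and 0 for high and low levels, and the choice to return None when nothing has been driven are all assumptions made for the example.

```python
# Illustrative model of one differential pair (e.g. lines 110A and 110B).
# All names and the 1/0 level encoding are assumptions for this sketch.
HIGH, LOW = 1, 0


def precharge():
    """Both lines of the pair rest at the high precharge level."""
    return (HIGH, HIGH)


def drive(bit):
    """Return (line_a, line_b) after the driving agent pulls one line low.

    A logical one pulls line A low and a logical zero pulls line B low,
    following the high-precharge example in the text.
    """
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (LOW, HIGH) if bit == 1 else (HIGH, LOW)


def sample(pair):
    """Recover the bit from the difference between the two lines.

    Returns None when the lines are equal, i.e. still precharged.
    """
    line_a, line_b = pair
    if line_a == line_b:
        return None
    return 1 if line_a < line_b else 0
```

A receiver in this model only compares the two lines, so it needs no absolute reference level. The low-precharge alternative would swap HIGH and LOW in `precharge` and `drive`, and the comparison in `sample` would flip with them.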

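The distributed address arbitration described earlier for FIG. 4 can be sketched in the same way. The text allows any suitable arbitration scheme, so the policy below is an assumption chosen for illustration: the most recent winner drops to lowest priority. The class and function names are hypothetical. Only the eight request lines (A_Req[7:0]) and the agent identifier in bits 9:6 of the transaction ID come from the description.

```python
# Hypothetical per-agent arbiter model. The rotate-winner-to-lowest policy
# is an assumption; the description permits any suitable scheme.
AGENT_ID_SHIFT = 6   # agent identifier occupies bits 9:6 of the transaction ID
AGENT_ID_MASK = 0xF


def agent_id_from_tag(transaction_id):
    """Extract the agent identifier (bits 9:6) from a 10-bit transaction ID."""
    return (transaction_id >> AGENT_ID_SHIFT) & AGENT_ID_MASK


class AddressArbiter:
    """Tracks which other agents currently outrank this agent."""

    def __init__(self, agent_id, num_agents=8):
        self.agent_id = agent_id
        self.num_agents = num_agents
        # Assumed reset state: lower-numbered agents start with higher priority.
        self.higher = set(range(agent_id))

    def wins(self, a_req):
        """a_req is the sampled request vector, one bit per agent."""
        if not (a_req >> self.agent_id) & 1:
            return False
        return not any((a_req >> other) & 1 for other in self.higher)

    def update(self, transaction_id):
        """Update priority state from the tag seen in the address phase."""
        winner = agent_id_from_tag(transaction_id)
        if winner == self.agent_id:
            # This agent won: every other agent now outranks it.
            self.higher = set(range(self.num_agents)) - {self.agent_id}
        else:
            # The winner now ranks below this agent.
            self.higher.discard(winner)


def run_round(arbiters, a_req):
    """Run one arbitration round; return the winning agent ID or None."""
    winners = [a.agent_id for a in arbiters if a.wins(a_req)]
    if not winners:
        return None
    winner = winners[0]
    tag = winner << AGENT_ID_SHIFT  # low tag bits left as zero in this sketch
    for a in arbiters:
        a.update(tag)
    return winner
```

Each arbiter sees the same request vector and the same tag, so all copies of the priority state stay consistent without a central arbiter. With agents 1 and 3 requesting continuously, the grant alternates between them.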
Turning next to FIG. 9, a block diagram of a carrier 
medium 300 including a database representative of system 
10 is shown. Generally speaking, a carrier medium may
include storage media such as magnetic or optical media, 
e.g., disk or CD-ROM, volatile or non-volatile memory 
media such as RAM (e.g. SDRAM, RDRAM, SRAM, etc.), 
ROM, etc., as well as transmission media or signals such as 
electrical, electromagnetic, or digital signals, conveyed via
a communication medium such as a network and/or a 
wireless link. 

Generally, the database of system 10 carried on carrier 
medium 300 may be a database which can be read by a 
program and used, directly or indirectly, to fabricate the
hardware comprising system 10. For example, the database 
may be a behavioral-level description or register-transfer 
level (RTL) description of the hardware functionality in a 
high level design language (HDL) such as Verilog or VHDL. 
The description may be read by a synthesis tool which may
synthesize the description to produce a netlist comprising a 
list of gates from a synthesis library. The netlist comprises 
a set of gates which also represent the functionality of the 
hardware comprising system 10. The netlist may then be 
placed and routed to produce a data set describing geometric
shapes to be applied to masks. The masks may then be used 
in various semiconductor fabrication steps to produce a 
semiconductor circuit or circuits corresponding to system 
10. Alternatively, the database on carrier medium 300 may 
be the netlist (with or without the synthesis library) or the
data set, as desired. 

While carrier medium 300 carries a representation of 
system 10, other embodiments may carry a representation of 
any portion of system 10, as desired, including any set of one 
or more agents (e.g. processors, L2 cache, memory
controller, etc.) or circuitry therein (e.g. arbiters, etc.), bus 
24, etc. 

Numerous variations and modifications will become 
apparent to those skilled in the art once the above disclosure 
is fully appreciated. It is intended that the following claims
be interpreted to embrace all such variations and modifica- 
tions. 

What is claimed is: 

1. A system comprising: 

a bus; and
an agent coupled to said bus and to receive a clock signal 
for said bus, said clock signal having a rising edge and 
a falling edge during use, wherein said agent is con- 
figured to drive one or more signals on said bus 
responsive to a first edge, the first edge being one of
said rising edge or said falling edge, and wherein said 
agent is configured to sample a value on said bus 
responsive to a second edge, the second edge being the 
other one of said rising edge or said falling edge.

2. The system as recited in claim 1 wherein said first edge
is said rising edge and said second edge is said falling edge. 

3. The system as recited in claim 1 wherein said first edge 
is said falling edge and said second edge is said rising edge. 

4. The system as recited in claim 1 wherein said agent is 
configured to terminate driving said one or more signals
responsive to said second edge. 

5. The system as recited in claim 4 wherein said agent is 
configured to precharge said one or more signals during a 
period of time between an occurrence of said second edge 
and a subsequent occurrence of said first edge.

6. The system as recited in claim 1 wherein said agent is 
configured to evaluate said value to determine if said agent 
is to drive said one or more signals responsive to said second
edge, and wherein said agent is configured to drive said one 
or more signals on a next occurrence of said first edge 
responsive to evaluating said value. 

7. The system as recited in claim 1 wherein said bus 
comprises a differential pair of conductors for a first signal. 

8. The system as recited in claim 7 wherein said agent is 
configured to drive one of said differential pair to drive said 
first signal responsive to a first value to be driven on said first 
signal. 

9. The system as recited in claim 8 wherein said agent is 
configured to drive said one of said differential pair low. 

10. The system as recited in claim 1 wherein said bus 
comprises an address bus, a data bus, and response lines, and 
wherein said response lines carry signals to maintain cache 
coherency with respect to transactions on said bus, and 
wherein a transmission of data corresponding to two or more 
transactions on said data bus is capable of occurring out of 
order with respect to a transmission of addresses corre- 
sponding to said two or more transactions on said address 
bus. 

11. The system as recited in claim 10 wherein transmis- 
sion of a response corresponding to a first transaction on said 
response lines of said bus is fixed in time with respect to 
transmission of an address corresponding to said first trans- 
action on said address bus. 

12. A method comprising: 

driving a value on a bus responsive to a first edge of a 
clock signal for said bus, said first edge being one of a 
rising edge or a falling edge of said clock signal; and 

sampling said value from said bus responsive to a second 
edge of said clock signal, the second edge being the 
other one of said rising edge or said falling edge. 

13. The method as recited in claim 12 wherein said first 
edge is said rising edge and said second edge is said falling 
edge. 

14. The method as recited in claim 12 wherein said first 
edge is said falling edge and said second edge is said rising 
edge. 

15. The method as recited in claim 12 further comprising 
terminating said driving said value responsive to said second 
edge. 

16. The method as recited in claim 15 further comprising 
precharging said bus during a period of time between an 
occurrence of said second edge and a subsequent occurrence 
of said first edge. 

17. The method as recited in claim 12 further comprising: 
evaluating said value responsive to said second edge; and 
driving a second value on a next occurrence of said first
edge responsive to said evaluating said value.

18. The method as recited in claim 12 wherein said bus 
comprises a differential pair of conductors for a first signal. 

19. The method as recited in claim 18 wherein said 
driving said value comprises driving one of said differential 
pair to drive said first signal responsive to said value. 

20. The method as recited in claim 19 wherein said 
driving said one of said differential pair comprises driving 
said one of said differential pair low. 

21. The method as recited in claim 12 wherein said bus 
comprises an address bus, a data bus, and response lines, and 
wherein said response lines carry signals to maintain cache 
coherency with respect to transactions on said bus, and 
wherein a transmission of data corresponding to two or more 
transactions on said data bus is capable of occurring out of 
order with respect to a transmission of addresses corre- 
sponding to said two or more transactions on said address 
bus. 
22. The method as recited in claim 21 wherein transmis-
sion of a response corresponding to a first transaction on said 
response lines of said bus is fixed in time with respect to 
transmission of an address corresponding to said first trans- 
action on said address bus. 

23. The method as recited in claim 12 wherein said first 
edge and said second edge are successive edges of said clock 
signal. 

24. The system as recited in claim 1 further comprising a 
second agent coupled to said bus and to receive said clock 
signal, wherein said second agent is configured to sample a 
second value driven by said agent, said agent driving said 
second value responsive to said first edge and said second 
agent sampling said second value responsive to said second 
edge, wherein said first edge and said second edge are 
successive edges of said clock signal. 

25. A carrier medium comprising a database which is 
operated upon by a program executable on a computer 
system, the program operating on the database to perform a 
portion of a process to fabricate an integrated circuit includ- 
ing circuitry described by the database, the circuitry 
described in the database including: 

a bus; and 

an agent coupled to said bus and to receive a clock signal 
for said bus, said clock signal having a rising edge and 
a falling edge during use, wherein said agent is con- 
figured to drive one or more signals on said bus 
responsive to a first edge, the first edge being one of 
said rising edge or said falling edge, and wherein said 
agent is configured to sample a value on said bus 
responsive to a second edge, the second edge being the 
other one of said rising edge or said falling edge. 

26. The carrier medium as recited in claim 25 wherein 
said first edge is said rising edge and said second edge is said 
falling edge. 

27. The carrier medium as recited in claim 25 wherein 
said first edge is said falling edge and said second edge is 
said rising edge. 
28. The carrier medium as recited in claim 25 wherein
said agent is configured to terminate driving said one or 
more signals responsive to said second edge. 

29. The carrier medium as recited in claim 28 wherein 
said agent is configured to precharge said one or more
signals during a period of time between an occurrence of
said second edge and a subsequent occurrence of said first 
edge. 

30. The carrier medium as recited in claim 25 wherein 
said agent is configured to evaluate said value to determine 
if said agent is to drive said one or more signals responsive 
to said second edge, and wherein said agent is configured to 
drive said one or more signals on a next occurrence of said 
first edge responsive to evaluating said value. 

31. The carrier medium as recited in claim 25 wherein 
said bus comprises a differential pair of conductors for a first
signal.

32. The carrier medium as recited in claim 31 wherein 
said agent is configured to drive one of said differential pair 
to drive said first signal responsive to a first value to be
driven on said first signal.

33. The carrier medium as recited in claim 32 wherein 
said agent is configured to drive said one of said differential 
pair low. 

34. The carrier medium as recited in claim 25 wherein 
said bus comprises an address bus, a data bus, and response
lines, and wherein said response lines carry signals to
maintain cache coherency with respect to transactions on 
said bus, and wherein a transmission of data corresponding 
to two or more transactions on said data bus is capable of 
occurring out of order with respect to a transmission of 
addresses corresponding to said two or more transactions on 
said address bus. 

35. The carrier medium as recited in claim 34 wherein 
transmission of a response corresponding to a first transac-
tion on said response lines of said bus is fixed in time with
respect to transmission of an address corresponding to said
first transaction on said address bus. 

***** 